x86
--------------------------------------------------------------------------------
  - a family of inst. sets (not a single one), extended with back. compat.
  - mainly for desktops
  - arch.: register-memory
  - CISC, 3000+ instructions total (~1000 mnemonics)
  - variable instruction size (1 to 15 bytes, usually 2-3 bytes)
  - little endian
  - partially "open"
  - 16, 32, 64 bit versions
  - supports (usually as extension) float (x87), SIMD (MMX, SSE, AVX, ...)
  - modes:
    - real: 20b segmented addr. (~1 MiB RAM), no mem. protection
    - unreal: unofficial real mode variant that can address up to 4 GiB
    - protected: 16 MB (1 GB) of physical (virtual) RAM on the 80286 (4 GB on
      32 bit CPUs), protected memory
    - long
    - ...
  - bloat, implementations use speculation, reordering, prediction, microcode,
      pipelines etc.
  - registers:
                  general purpose registers:

      64b | RAX      | RBX      | RCX      | RDX      |
      32b | | EAX    | | EBX    | | ECX    | | EDX    |
      16b | | | AX   | | | BX   | | | CX   | | | DX   |
       8b | | |AH AL | | |BH BL | | |CH CL | | |DH DL |

      other registers:

      (E/R)FLAGS  <- various flags set by operations
        CF    PF     AF   ZF   SF   TF   IF   DF   OF ...
        carry parity aux. zero sign trap int. dir. overflow 

      (E/R)SP     <- stack pointer
      (E/R)BP     <- stack base pointer
      (E/R)IP     <- instruction pointer

      (E/R)SI     <- source pointer
      (E/R)DI     <- destination pointer

      CS          <- code pointer     \
      DS          <- data pointer     | segment
      SS          <- stack            | registers
      ES,FS,GS    <- extra pointer    /
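
  The nesting in the general purpose register table can be sketched with bit
  masks. This is only an illustration in Python (not x86 code); the value put
  in rax is arbitrary and the variable names just mirror the register names:

```python
rax = 0x1122334455667788      # example 64 bit value (arbitrary)

eax = rax & 0xFFFFFFFF        # low 32 bits of RAX
ax  = rax & 0xFFFF            # low 16 bits of RAX
al  = rax & 0xFF              # low 8 bits of AX
ah  = (rax >> 8) & 0xFF       # high 8 bits of AX

print(hex(eax), hex(ax), hex(ah), hex(al))  # 0x55667788 0x7788 0x77 0x88
```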

  - instruction format:

    - 0 to 4 prefix bytes modifying the instruction
    - 1 to 3 bytes opcode identifying the instruction
    - 0 to 1 bytes describing the operands (memory/registers), "ModR/M"
    - 0 to 1 bytes of a "scale-index-base" (SIB) byte, used for addressing
    - 0 to 4 memory displacement bytes, specify the address offset
    - 0 to 4 immediate bytes, specify a constant value

  - basic instructions:
   
    ADD               add
    ADC               add with carry
    CALL              call procedure (pushes EIP and jumps)
    DEC               decrement
    DIV               unsigned divide
    IDIV              signed divide
    IMUL              signed multiply
    INC               increment
    JNE, JNZ, JZ, ... jump if condition (not equal, not zero, zero, ...)
    JMP               unconditional jump
    MOV               move (copy data)
    MUL               unsigned multiply
    NEG               negation (two's complement)
    NOP               no operation
    POP               pop from stack
    PUSH              push onto stack
    ROL               rotate left
    SHR               shift right
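
  The 20b segmented addressing of real mode (see modes above) can be sketched
  as follows. This assumes the classic 8086 rule (segment shifted left by 4
  bits plus offset, wrapping around at 1 MiB); the function name is made up
  for illustration:

```python
def real_mode_address(segment, offset):
    """Physical address from a 16 bit segment and a 16 bit offset (8086 style)."""
    # segment * 16 + offset, truncated to 20 bits (assumes wrap-around at 1 MiB)
    return ((segment << 4) + offset) & 0xFFFFF

print(hex(real_mode_address(0xB800, 0x0000)))  # 0xb8000
print(hex(real_mode_address(0xFFFF, 0x0010)))  # 0x0 (wrapped around)
```

  Note that many segment:offset pairs map to the same physical address.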

ARM (advanced RISC machines)
--------------------------------------------------------------------------------
  - family of instruction sets (ARMv1, ARMv2, ARMv3, ...)
  - mainly for embedded, simple, low energy consumption and heat
  - arch.: load-store
  - "proprietary"
  - fixed instr. length (32b), BUT there is also a Thumb subset that encodes
    instrs. as 16b (smaller code but fewer instructions), and Thumb2 (variable
    instr. size)
  - little endian, can be switched to big
  - RISC, 232 instructions (~50 mnemonics)
  - 32b, 64b
  - mostly 1 CPI
  - modes:
    - user: unprivileged (can't do certain things)
    - supervisor: privileged
    - undefined: after undefined inst.
    - abort: after memory access violation
    - ...
  - older versions don't have divide instruction! (newer ones add SDIV/UDIV)
  - implementations don't use microcode, are often simple without caches etc.
  - instruction format:
    
    | operand  |dst|src||opc||0|co |
    |    2     |reg|reg||ode||0| nd|
    --------........--------........
 
      Almost all instructions can have a condition.
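
  A rough sketch of pulling the fields of the diagram out of an instruction
  word, assuming the classic 32 bit data-processing encoding (condition in
  bits 31-28, opcode in bits 24-21, source register in 19-16, destination
  register in 15-12, second operand in 11-0). The function name is
  hypothetical and the remaining bits are ignored:

```python
def decode_data_processing(word):
    """Split a 32 bit ARM data-processing instruction into its fields."""
    return {
        "cond":     (word >> 28) & 0xF,    # condition, 0b1110 = always
        "opcode":   (word >> 21) & 0xF,    # operation, 0b0100 = ADD
        "src":      (word >> 16) & 0xF,    # first operand register
        "dst":      (word >> 12) & 0xF,    # destination register
        "operand2": word & 0xFFF,          # second operand (register or constant)
    }

fields = decode_data_processing(0xE0810002)  # ADD R0, R1, R2
print(fields)
```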

  - registers:
      - all 32 bit
      - general purpose: R0 - R12
      - stack pointer: R13
      - link register: R14 (function return address)
      - program counter: R15
      - flags: CPSR (CPU mode, thumb, endian, zero, carry, ...)

  - basic instructions:
   
    ADC               add with carry
    ADD               add
    AND               and operation
    B, BNE, BEQ, ...  branch if (always, not equal, equal, ...)
    CMP               compare
    LDR               load memory to register
    MOV               move register/constant to register
    MUL               multiply
    STR               store register to memory
    SWI               software interrupt
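
  Why ADC exists: with 32 bit registers, a 64 bit addition is done as an add
  that sets the carry flag followed by an add with carry. A minimal sketch of
  the idea in Python (not ARM code; the function name is made up):

```python
MASK32 = 0xFFFFFFFF

def add64_with_32bit_regs(a_lo, a_hi, b_lo, b_hi):
    """Add two 64 bit numbers held as (low, high) pairs of 32 bit words."""
    lo = a_lo + b_lo                 # plain add of the low words
    carry = lo >> 32                 # carry flag: 1 if the low add overflowed
    hi = a_hi + b_hi + carry         # add with carry of the high words
    return lo & MASK32, hi & MASK32

lo, hi = add64_with_32bit_regs(0xFFFFFFFF, 0x00000001, 0x00000001, 0x00000000)
print(hex(hi), hex(lo))  # 0x2 0x0
```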

RISC-V
--------------------------------------------------------------------------------

POWER PC
--------------------------------------------------------------------------------

AVR
--------------------------------------------------------------------------------
  - by Atmel, for embedded
  - arch.: load-store
  - RISC, ~120 instructions
  - 8 bit
  - Harvard architecture (separate instruction and data memory)
  - instruction format:
    - 16 bit (but some which have long addresses are 32 bit)
    - the format differs between instructions (opcode is in different places,
      of different size etc.)
  - doesn't have divide instruction!
  - registers:
    - most 8 bit
    - general purpose: R0 - R31
    - addressing: X (R27,R26), Y (R29,R28), Z (R31,R30)
    - program counter: PC (16 or 22 bit)
    - stack pointer: SP (8 or 16 bit)
    - flags (status): SREG (carry, zero, negative, overflow, sign, half-carry,
       bit copy, interrupt)
  - basic instructions:
   
    ADC              add with carry
    ADD              add without carry
    AND              logical and
    BRBC             branch if SREG bit is cleared, jump if specified SREG bit is 0
    BRGE             branch if >= (signed), branches if sign flag is 0
    BRSH             branch if >= (unsigned), branches if carry flag is 0
    CP               compare, only sets flags
    INC              increment
    JMP              jump (long, to any address)
    LDI              load immediate 8 bit value to register
    LDS              load direct from data space, loads 8 bits from memory
    LPM              load program memory, loads 8 bits from program memory
    MOV              move register to register
    MUL              multiply unsigned (16 bit result)
    MULS             multiply signed (16 bit result)
    MULSU            multiply signed with unsigned (signed 16 bit result)
    NEG              two's complement negation
    NOP              no operation
    SBRC             skip if register bit is 0, conditionally skips next inst.
    SUB              subtract
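
  Since there is no divide instruction, division has to be done in software,
  typically with shifts, compares and subtractions. A minimal sketch of the
  shift-and-subtract method for unsigned 8 bit values, written in Python for
  readability (real AVR routines differ in detail; the function name is
  hypothetical):

```python
def divide_u8(dividend, divisor):
    """Unsigned 8 bit division by shift and subtract; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient, remainder = 0, 0
    for bit in range(7, -1, -1):                 # from the highest bit down
        # bring down the next bit of the dividend
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        if remainder >= divisor:                 # divisor fits: subtract it
            remainder -= divisor
            quotient |= 1 << bit
    return quotient, remainder

print(divide_u8(200, 7))  # (28, 4)
```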

Java bytecode
--------------------------------------------------------------------------------
  - stack/register architecture
  - 202 opcodes
  - in each function there is a stack (arguments, computation, return value) and
    local variable array (same as registers)
  - variable instructions size: 1B opcode and 1 to N operands
  - has objects

  - basic instructions:
  
    arraylength         replaces array reference on stack top with its length
    breakpoint          break point for debuggers
    f2i                 converts float on top of stack to int
    goto                jump
    goto_w              longer jump
    iadd                pushes result of addition of 2 ints on stack top
    iand                performs bitwise and on 2 ints on stack top
    iconst_m1           loads -1 on top of stack
    idiv                divides two integers on top of stack
    ifeq                if top of stack is 0, branch to given address
    iload_0             load int local variable # 0 on top of stack
    new                 create new object of class of given ID
    newarray            create new array of given length
    nop                 no operation
    pop                 pop top of stack
    putfield            set given field of given object
    return              return void from function
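
  The stack plus local variable array model can be sketched with a toy
  interpreter. This is not the JVM: the (mnemonic, operand) tuples are a
  made-up encoding, only a few int instructions are handled (ireturn is the
  int-returning variant of return) and 32 bit wrap-around is ignored:

```python
def run(code, local_vars):
    """Interpret a tiny list of (mnemonic, operand) pairs on an operand stack."""
    stack = []
    for op, arg in code:
        if op == "iload":        # push local variable number arg
            stack.append(local_vars[arg])
        elif op == "iconst":     # push an int constant
            stack.append(arg)
        elif op == "iadd":       # pop two ints, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "ireturn":    # return the int on top of the stack
            return stack.pop()
        else:
            raise ValueError("unknown instruction: " + op)

# computes local0 + local1 + (-1)
program = [("iload", 0), ("iload", 1), ("iadd", None), ("iconst", -1),
           ("iadd", None), ("ireturn", None)]
print(run(program, [2, 3]))  # 4
```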
 
Python bytecode
--------------------------------------------------------------------------------
  - not official, just an implementation detail, and differs between Py versions
  - 2 bytes per instruction (1B opcode, 1B argument)
  - evaluation stack contains abstract objects just like Python (numbers, lists,
    objects, ...)
  - basic instructions:

    BINARY_ADD          adds 2 top stack items and pushes result
    BINARY_MULTIPLY     multiplies 2 top stack items and pushes result
    CALL_FUNCTION       passes N args from stack top and calls func below them
    EXTENDED_ARG        for arguments bigger than 1 byte
    GET_LEN             pushes len() of top of the stack
    JUMP_FORWARD        unconditionally jump forward by given offset
    LIST_APPEND         appends stack top to a list further down the stack
    LOAD_CONST          loads constant on top of stack
    NOP                 no operation (placeholder for optimizer)
    POP_JUMP_IF_TRUE    pops and conditionally jumps to given address
    POP_TOP             pop top of the stack
    RETURN_VALUE        returns value to the caller
    ROT_TWO             swaps two top items in the stack
    UNARY_NOT           logically negates (not) top of the stack
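
  The bytecode of a function can be inspected with the standard dis module.
  A small sketch, assuming CPython 3.6 or newer; the exact opcode names
  printed depend on the version (e.g. BINARY_ADD in older versions,
  BINARY_OP in newer ones), so no fixed listing is shown:

```python
import dis

def add(a, b):
    return a + b

# List the names of the instructions the function was compiled to.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)

# The raw bytecode is a sequence of 2 byte units (opcode + argument).
print(len(add.__code__.co_code) % 2)  # 0
```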

LLVM
-------------------------------------------------------------------------------- 
  - intermediate representation for compilers
  - RISC
  - strongly typed
  - kind of bloat, many "features"
  - abstracts things like calling conventions and modules (but programs
    compiled to this from languages may not be 100% target-independent because
    of things like sizeof())
  - registers: 
    - infinitely many tmp. registers (%0, %1, ...)
  - basic instructions:

    add                    add numbers of identical types
    br                     branch within function (either conditional or not)
    alloca                 allocates stack memory (and auto deallocates later)
    call                   call a function
    fadd                   add float numbers
    icmp                   compare integers and return a boolean (i1) result
    mul                    multiply two numbers of same type (gives same type)
    ret                    return from function
    switch                 switch (like in C)
